How many researchers…?

image of a light bulb

The best thing about being a statistician…

John Tukey

Backyard #1

Children's Mercy Hospital building

Backyard #2

Cleveland Chiropractic College building

Backyard #3

MRI Global building

Backyard #4

North Kansas City Hospital building

Backyard #5

Saint Luke's Hospital building

Backyard #6

Truman Medical Center building

My favorite two backyards

UMKC and KUMC mascots

New backyard: Russ Waitman

Russ Waitman

i2b2 software

Screenshot of i2b2 software

The database structure behind i2b2

Diagram of i2b2 schema

How many surgeries?

library(DBI)       # dbGetQuery
library(magrittr)  # %>% and use_series
# c_connect is assumed to be an open database connection created earlier.
select_surgeries <- 
  "SELECT name_char FROM blueherondata.concept_dimension
     WHERE name_char LIKE '%ectomy%'"
dbGetQuery(c_connect, select_surgeries) %>%   # Extract records
  use_series(NAME_CHAR) %>%                   # Convert to vector
  strsplit(" ") %>%                           # Split into words
  unlist %>%                                  # Re-convert to vector
  tolower %>%                                 # Force to lower case
  grep("ectomy", ., value=TRUE) %>%           # Toss extraneous words
  gsub("[[:punct:]]", "", .) %>%              # Remove punctuation
  gsub("ectomy.*", "-", .) %>%                # Replace ectomy suffix with "-"
  unique %>%                                  # Remove duplicates
  sample(100, replace=FALSE) %>%              # Draw 100 at random
  sort %>%                                    # Arrange alphabetically
  paste(collapse=", ")                        # Delimit with commas

How many surgeries?

"acromion-, adenoid-, alveol-, apic-, apico-, arthr-, arytenoid-, astragal-, ather-, burs-, capsul-, carp-, clitor-, coccyg-, crani-, dacryoaden-, dacryocyst-, diaphys-, disarticulationhemipelv-, disk-, diverticul-, endarter-, epididym-, epiglottid-, epiplo-, ethmoid-, fasci-, fistul-, frenul-, ganglion-, gastr-, gingiv-, gloss-, hemigastr-, hemigloss-, hemilamin-, hemilaryng-, hemiphalang-, hemorrhoid-, hepat-, hymen-, hyster-, infundibul-, irid-, labyrinth-, lamin-, lip-, lump-, mucos-, my-, myom-, nephr-, nephroureter-, oophor-, osteophyt-, pannicul-, patell-, phalang-, pharyngolaryng-, phleb-, pleur-, plex-, pneumon-, postadenoid-, postcholecyst-, postgastr-, postlymphaden-, postmastoid-, postpolyp-, postprostat-, postsplen-, prostat-, rectosigmoid-, salping-, salpingoophor-, scler-, segment-, sequestr-, sialoaden-, sigmoid-, sphenoid-, sympath-, synov-, tenon-, tenosynov-, trabecul-, trachel-, trisection-, trisegment-, turbin-, tyl-, tympanomastoid-, umbil-, urethr-, uvul-, vagin-, valv-, vas-, vesicul-, vulv-"
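The cleaning steps can be tried without database access. The following is a minimal sketch in base R, assuming a small invented vector in place of the NAME_CHAR column; the names `name_char` and `extract_prefixes` and the example strings are hypothetical and do not come from the i2b2 database.

```r
# Hypothetical stand-in for the NAME_CHAR column returned by the query.
# These strings are invented for illustration only.
name_char <- c(
  "Laparoscopic Appendectomy",
  "Tonsillectomy and adenoidectomy",
  "Status post-cholecystectomy",
  "APPENDECTOMY, open"
)

# The same cleaning steps as the pipeline, written as a base R function
# (no random sampling, since the example vector is tiny).
extract_prefixes <- function(x) {
  words <- tolower(unlist(strsplit(x, " ")))    # Split into lower-case words
  words <- grep("ectomy", words, value = TRUE)  # Keep only the -ectomy words
  words <- gsub("[[:punct:]]", "", words)       # Remove punctuation
  words <- gsub("ectomy.*", "-", words)         # Replace the suffix with "-"
  sort(unique(words))                           # Remove duplicates, then sort
}

prefixes <- extract_prefixes(name_char)
print(paste(prefixes, collapse = ", "))
```

Because punctuation is removed before the suffix is replaced, a term such as "post-cholecystectomy" collapses to "postcholecyst-", which is why the list contains so many entries beginning with "post".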

Mei Liu, Acute Kidney Injury

Mei Liu, next to one of her research publications

Requirements needed for mining the electronic health record

Technical requirements

  • Working familiarity with SQL
  • Data wrangling skills

Non-technical requirements

  • Lust for data
  • An interesting backyard

Informatics meetup

Flier for May 23 Informatics meetup

Where you can find a copy of this talk.

This presentation was developed using R Markdown. You can find all the important stuff at

In particular, look for

  • doc/mining-v2-image-credits.txt
  • doc/mining-v2-slides.pptx
  • doc/mining-v2-speaker-notes.pdf